Smart Paper Notes

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

Teven Le Scao , Angela Fan , Christopher Akiki , Ellie Pavlick , Suzana Ilić , Daniel Hesslow , Roman Castagné , Alexandra Sasha Luccioni , François Yvon , Matthias Gallé

Category: Natural Language Processing

2022-11-09

Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.

Clustering Interval-Censored Time-Series for Disease Phenotyping

Irene Y. Chen , Rahul G. Krishnan , David Sontag

Categories: Machine Learning (Statistics) | Machine Learning

2021-02-13

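Unsupervised learning is often used to uncover clusters in data. However, different kinds of noise may impede the discovery of useful patterns in real-world time-series data. In this work, we focus on mitigating the interference of interval censoring in the task of clustering for disease phenotyping. We develop a deep generative, continuous-time model of time-series data that clusters time series while correcting for censorship time. We provide conditions under which clusters and delayed entry can be identified from data under a noiseless model.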

Deep Learning for Space Weather Prediction: Bridging the Gap between Heliophysics Data and Theory

John C. Dorelli , Chris Bard , Thomas Y. Chen , Daniel Da Silva , Luiz Fernando Guides dos Santos , Jack Ireland , Michael Kirk , Ryan McGranaghan , Ayris Narock , Teresa Nieves-Chinchilla

Category: Machine Learning

2022-12-27

Traditionally, data analysis and theory have been viewed as separate disciplines, each feeding into fundamentally different types of models. Modern deep learning technology is beginning to unify these two disciplines and will produce a new class of predictively powerful space weather models that combine the physical insights gained by data and theory. We call on NASA to invest in the research and infrastructure necessary for the heliophysics community to take advantage of these advances.

Heliophysics Discovery Tools for the 21st Century: Data Science and Machine Learning Structures and Recommendations for 2020-2050

R. M. McGranaghan , B. Thompson , E. Camporeale , J. Bortnik , M. Bobra , G. Lapenta , S. Wing , B. Poduval , S. Lotz , S. Murray

Categories: Artificial Intelligence | Machine Learning

2022-12-26

Three main points: 1. Data Science (DS) will be increasingly important to heliophysics; 2. Methods of heliophysics science discovery will continually evolve, requiring the use of learning technologies [e.g., machine learning (ML)] that are applied rigorously and that are capable of supporting discovery; and 3. To grow with the pace of data, technology, and workforce changes, heliophysics requires a new approach to the representation of knowledge.

Artificial Intelligence to Enhance Mission Science Output for In-situ Observations: Dealing with the Sparse Data Challenge

M. I. Sitnov , G. K. Stephens , V. G. Merkin , C. -P. Wang , D. Turner , K. Genestreti , M. Argall , T. Y. Chen , A. Y. Ukhorskiy , S. Wing

Category: Machine Learning

2022-12-26

In the Earth's magnetosphere, there are fewer than a dozen dedicated probes beyond low-Earth orbit making in-situ observations at any given time. As a result, we poorly understand its global structure and evolution, the mechanisms of its main activity processes, magnetic storms, and substorms. New Artificial Intelligence (AI) methods, including machine learning, data mining, and data assimilation, as well as new AI-enabled missions will need to be developed to meet this Sparse Data challenge.

Exponentially Improving the Complexity of Simulating the Weisfeiler-Lehman Test with Graph Neural Networks

Anders Aamand , Justin Y. Chen , Piotr Indyk , Shyam Narayanan , Ronitt Rubinfeld , Nicholas Schiefer , Sandeep Silwal , Tal Wagner

Categories: Machine Learning | Machine Learning (Statistics)

2022-11-06

Recent work shows that the expressive power of Graph Neural Networks (GNNs) in distinguishing non-isomorphic graphs is exactly the same as that of the Weisfeiler-Lehman (WL) graph test. In particular, they show that the WL test can be simulated by GNNs. However, those simulations involve neural networks for the 'combine' function of size polynomial or even exponential in the number of graph nodes $n$, as well as feature vectors of length linear in $n$. We present an improved simulation of the WL test on GNNs with exponentially lower complexity. In particular, the neural network implementing the combine function in each node has only a polylogarithmic number of parameters in $n$, and the feature vectors exchanged by the nodes of the GNN consist of only $O(\log n)$ bits. We also give logarithmic lower bounds for the feature vector length and the size of the neural networks, showing the (near)-optimality of our construction.
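
For readers unfamiliar with the WL test mentioned in this abstract, the following is a minimal sketch of the classical 1-dimensional WL color refinement, not the paper's GNN construction. The function names `wl_refine` and `wl_distinguishes` and the adjacency-dictionary input format are assumptions made for illustration. Each round combines a node's color with the multiset of its neighbors' colors, which is the role the 'combine' function plays in a GNN simulation.

```python
from collections import Counter


def wl_refine(adj, max_rounds=None):
    """Classical 1-WL color refinement on a graph given as {node: [neighbors]}."""
    colors = {v: 0 for v in adj}  # every node starts with the same color
    rounds = len(adj) if max_rounds is None else max_rounds
    for _ in range(rounds):
        # Signature: own color plus the sorted multiset of neighbor colors.
        sigs = {v: (colors[v], tuple(sorted(colors[u] for u in adj[v]))) for v in adj}
        # Relabel distinct signatures with small integers (the "combine" step).
        palette = {sig: i for i, sig in enumerate(sorted(set(sigs.values())))}
        refined = {v: palette[sigs[v]] for v in adj}
        # Refinement only splits classes, so an unchanged count means stability.
        if len(set(refined.values())) == len(set(colors.values())):
            break
        colors = refined
    return colors


def wl_distinguishes(adj_a, adj_b):
    """Return True if 1-WL assigns the two graphs different color histograms."""
    # Refine on the disjoint union so both graphs share one color palette.
    union = {("a", v): [("a", u) for u in nbrs] for v, nbrs in adj_a.items()}
    union.update({("b", v): [("b", u) for u in nbrs] for v, nbrs in adj_b.items()})
    colors = wl_refine(union)
    hist_a = Counter(c for (side, _), c in colors.items() if side == "a")
    hist_b = Counter(c for (side, _), c in colors.items() if side == "b")
    return hist_a != hist_b


triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
path = {0: [1], 1: [0, 2], 2: [1]}
two_triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: [4, 5], 4: [3, 5], 5: [3, 4]}
hexagon = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}

print(wl_distinguishes(triangle, path))
print(wl_distinguishes(two_triangles, hexagon))
```

The last pair is a standard case where the test fails: both graphs are 2-regular on six nodes, so every node keeps the same color even though the graphs are not isomorphic.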

RARR: Researching and Revising What Language Models Say, Using Language Models

Luyu Gao , Zhuyun Dai , Panupong Pasupat , Anthony Chen , Arun Tejasvi Chaganty , Yicheng Fan , Vincent Y. Zhao , Ni Lao , Hongrae Lee , Da-Cheng Juan

Categories: Natural Language Processing | Artificial Intelligence | Machine Learning

2022-10-17

Language models (LMs) now excel at many tasks such as few-shot learning, question answering, reasoning, and dialog. However, they sometimes generate unsupported or misleading content. A user cannot easily determine whether their outputs are trustworthy or not, because most LMs do not have any built-in mechanism for attribution to external evidence. To enable attribution while still preserving all the powerful advantages of recent generation models, we propose RARR (Retrofit Attribution using Research and Revision), a system that 1) automatically finds attribution for the output of any text generation model and 2) post-edits the output to fix unsupported content while preserving the original output as much as possible. When applied to the output of several state-of-the-art LMs on a diverse set of generation tasks, we find that RARR significantly improves attribution while otherwise preserving the original input to a much greater degree than previously explored edit models. Furthermore, the implementation of RARR requires only a handful of training examples, a large language model, and standard web search.

Volumetric-based Contact Point Detection for 7-DoF Grasping

Junhao Cai , Jingcheng Su , Zida Zhou , Hui Cheng , Qifeng Chen , Michael Y Wang

Category: Robotics

2022-09-14

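In this paper, we propose a novel grasp pipeline based on contact point detection on the truncated signed distance function (TSDF) volume to achieve closed-loop 7-degree-of-freedom (7-DoF) grasping in cluttered environments. The key aspects of our method are that 1) the proposed pipeline uses multi-view fusion, contact-point sampling and evaluation, and collision checking to provide reliable and collision-free 7-DoF gripper poses with real-time performance; and 2) the contact-based pose representation effectively eliminates the ambiguity introduced by normal-based methods, which provides a more precise and flexible solution. Extensive simulated and real-robot experiments demonstrate that the proposed pipeline selects more antipodal and stable grasp poses and outperforms normal-based baselines in terms of grasp success rate in both simulated and physical scenarios.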

Data efficient reinforcement learning and adaptive optimal perimeter control of network traffic dynamics

C. Chen , Y. P. Huang , W. H. K. Lam , T. L. Pan , S. C. Hsu , A. Sumalee , R. X. Zhong

Category: Machine Learning

2022-09-13

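Existing data-driven and feedback traffic control strategies do not consider the heterogeneity of real-time data measurements. In addition, traditional reinforcement learning (RL) methods usually converge slowly because they lack data efficiency. Moreover, conventional optimal perimeter control schemes require exact knowledge of the system dynamics and are therefore fragile to endogenous uncertainties. To handle these challenges, this work proposes an integral reinforcement learning (IRL) based approach to learning the macroscopic traffic dynamics for adaptive optimal perimeter control. This work makes the following primary contributions to the transportation literature: (a) A continuous-time control is developed with discrete gain updates to adapt to discrete-time sensor data. (b) To reduce the sampling complexity and use the available data more efficiently, the experience replay (ER) technique is introduced into the IRL algorithm. (c) The proposed method relaxes the requirement for model calibration in a "model-free" manner, which enables robustness against modeling uncertainty and enhances real-time performance via a data-driven RL algorithm. (d) The convergence of the IRL-based algorithms and the stability of the controlled traffic dynamics are proven via Lyapunov theory. The optimal control law is parameterized and then approximated by neural networks (NN), which moderates the computational complexity. Both state and input constraints are considered, while no model linearization is required. Numerical examples and simulation experiments are presented to verify the effectiveness and efficiency of the proposed method.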
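
As a rough illustration of the experience replay idea named in contribution (b), the sketch below shows a generic fixed-capacity replay buffer. It is not the paper's IRL algorithm: the class name `ReplayBuffer`, the `(state, action, cost, next_state)` tuple layout, and the uniform sampling rule are all assumptions chosen for illustration. The point is only that stored measurements can be reused across several updates instead of being discarded after one use.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of past transitions (hypothetical layout)."""

    def __init__(self, capacity, seed=None):
        # deque with maxlen silently drops the oldest item when full.
        self._items = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def add(self, state, action, cost, next_state):
        self._items.append((state, action, cost, next_state))

    def sample(self, batch_size):
        # Uniform sampling without replacement, capped at the buffer size.
        k = min(batch_size, len(self._items))
        return self._rng.sample(list(self._items), k)

    def __len__(self):
        return len(self._items)


buffer = ReplayBuffer(capacity=3, seed=0)
for t in range(5):
    # Toy scalar transitions; real traffic states would be vectors.
    buffer.add(state=t, action=0.5, cost=float(t), next_state=t + 1)

batch = buffer.sample(2)
print(len(buffer), len(batch))
```

A learning loop would draw a batch like this at each gain update, so that every sensor sample contributes to more than one update.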

Graph Neural Networks for Low-Energy Event Classification & Reconstruction in IceCube

R. Abbasi , M. Ackermann , J. Adams , N. Aggarwal , J. A. Aguilar , M. Ahlers , M. Ahrens , J. M. Alameddine , A. A. Alves Jr. , N. M. Amin

Category: Machine Learning

2022-09-07

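IceCube, a cubic-kilometer array of optical sensors built to detect atmospheric and astrophysical neutrinos between 1 GeV and 1 PeV, is deployed 1.45 km to 2.45 km below the surface of the ice sheet at the South Pole. The classification and reconstruction of events from the in-ice detectors play a central role in the analysis of IceCube data. Reconstructing and classifying events is a challenge because of the detector geometry, the inhomogeneous scattering and absorption of light in the ice, and, below 100 GeV, the relatively low number of signal photons produced per event. To address this challenge, IceCube events can be represented as point cloud graphs, with a Graph Neural Network (GNN) used as the classification and reconstruction method. The GNN is capable of distinguishing neutrino events from cosmic-ray backgrounds, classifying different neutrino event types, and reconstructing the deposited energy, direction, and interaction vertex. Based on simulation, we provide a comparison in the 1-100 GeV energy range with the current state-of-the-art maximum likelihood techniques used in current IceCube analyses, including the effects of known systematic uncertainties. For neutrino event classification, the GNN increases the signal efficiency by 18% at a fixed false positive rate (FPR), compared to current IceCube methods. Alternatively, the GNN reduces the FPR by more than a factor of 8 (to below half a percent) at a fixed signal efficiency. For the reconstruction of energy, direction, and interaction vertex, the resolution improves by an average of 13%-20% compared to current maximum likelihood techniques. When running on a GPU, the GNN can process IceCube events at a rate nearly double the median IceCube trigger rate of 2.7 kHz, which opens the possibility of using low-energy neutrinos in online searches for transient events.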
